A Hybrid Approach to Electrolaryngeal Speech Enhancement Based on Noise Reduction and Statistical Excitation Generation

Authors

Kou Tanaka

Tomoki Toda

Graham Neubig

Sakriani Sakti

Satoshi Nakamura

Abstract

This paper presents an electrolaryngeal (EL) speech enhancement method that significantly improves the naturalness of EL speech without degrading its intelligibility. An electrolarynx is an external device that artificially generates excitation sounds to enable laryngectomees to produce EL speech. Although proficient laryngectomees can produce quite intelligible EL speech, it sounds very unnatural because of the mechanical excitation produced by the device. Moreover, the excitation sounds often leak outside and are added to the EL speech as noise. Two conventional approaches to EL speech enhancement address these issues: noise reduction and statistical voice conversion (VC). The former usually causes no degradation in intelligibility but yields only small improvements in naturalness, because the mechanical excitation sounds remain essentially unchanged. The latter significantly improves naturalness by using spectral and excitation parameters of natural voices converted from acoustic parameters of EL speech, but it usually degrades intelligibility owing to conversion errors. We propose a hybrid approach that uses a noise reduction method to enhance spectral parameters and a statistical voice conversion method to predict excitation parameters. We further modify the excitation prediction process to improve its accuracy and to reduce the adverse effects of unvoiced/voiced prediction errors. Experimental results demonstrate that the proposed method yields significant improvements in naturalness over EL speech while keeping intelligibility sufficiently high.

Key words: speaking-aid, electrolaryngeal speech, spectral subtraction, voice conversion, hybrid approach
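The noise reduction half of the hybrid approach is associated with spectral subtraction in the key words. As a rough illustration only, the following is a minimal sketch of generic textbook magnitude spectral subtraction, not the authors' implementation. The function name `spectral_subtraction`, the parameters `frame_len`, `alpha` (over-subtraction factor), and `beta` (spectral floor), their default values, and the assumption that a separate recording of the leaked excitation noise is available are all hypothetical choices made for this example.

```python
import numpy as np


def spectral_subtraction(noisy, noise, frame_len=256, alpha=2.0, beta=0.01):
    """Generic magnitude spectral subtraction (illustrative sketch).

    noisy: 1-D signal containing speech plus additive noise.
    noise: 1-D noise-only recording used to estimate the noise spectrum.
    Both inputs must contain at least frame_len samples.
    """
    noisy = np.asarray(noisy, dtype=float)
    noise = np.asarray(noise, dtype=float)
    hop = frame_len // 2
    # Sine (square-root Hann) window: applied at analysis and synthesis,
    # the overlapping windows sum to 1 at 50% overlap.
    window = np.sin(np.pi * np.arange(frame_len) / frame_len)

    def stft(x):
        n_frames = 1 + (len(x) - frame_len) // hop
        idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
        return np.fft.rfft(x[idx] * window, axis=1)

    # Average noise magnitude per frequency bin.
    noise_mag = np.abs(stft(noise)).mean(axis=0)

    spec = stft(noisy)
    mag = np.abs(spec)
    # Subtract the scaled noise estimate; keep a small floor to limit
    # isolated spectral peaks ("musical noise").
    enhanced_mag = np.maximum(mag - alpha * noise_mag, beta * mag)
    # Reuse the noisy phase, as is common in spectral subtraction.
    enhanced_spec = enhanced_mag * np.exp(1j * np.angle(spec))

    # Overlap-add resynthesis.
    frames = np.fft.irfft(enhanced_spec, n=frame_len, axis=1) * window
    out = np.zeros(len(noisy))
    for i, frame in enumerate(frames):
        out[i * hop:i * hop + frame_len] += frame
    return out
```

Because only the magnitude spectrum is modified, the excitation itself is left as it is, which is consistent with the abstract's remark that noise reduction alone yields only small improvements in naturalness. The first and last half-frames are tapered by the window, so a real system would pad the signal before processing.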

Related resources

An Evaluation of a Hybrid Approach to Electrolaryngeal Speech Enhancement Based on Noise Reduction and Statistical Excitation Prediction

Kou Tanaka†, Tomoki Toda†, Graham Neubig†, Sakriani Sakti†, and Satoshi Nakamura†. † Graduate School of Information Science, Nara Institute of Science and Technology, 8916-5 Takayama-cho, Ikoma-shi, 630-0101, Japan. E-mail: †{ko-t,tomoki,neubig,ssakti,s-nakamura...

A hybrid approach to electrolaryngeal speech enhancement based on spectral subtraction and statistical voice conversion

We present a hybrid approach to improving naturalness of electrolaryngeal (EL) speech while minimizing degradation in intelligibility. An electrolarynx is a device that artificially generates excitation sounds to enable laryngectomees to produce EL speech. Although proficient laryngectomees can produce quite intelligible EL speech, it sounds very unnatural due to the mechanical excitation produ...

Evaluation of Excitation Feature Prediction in a Hybrid Approach to Electrolaryngeal Speech Enhancement

We implement micro-prosody removal by low-pass filtering and avoidance of unvoiced/voiced (U/V) prediction to improve statistical excitation prediction in the hybrid approach to electrolaryngeal (EL) speech enhancement. An electrolarynx is a device that artificially generates excitation sounds to enable laryngectomees to produce EL speech. Although proficient larynge...

Physically Constrained Statistical F0 Prediction for Electrolaryngeal Speech Enhancement

Electrolaryngeal (EL) speech produced by a laryngectomee using an electrolarynx to mechanically generate artificial excitation sounds severely suffers from unnatural fundamental frequency (F0) patterns caused by monotonic excitation sounds. To address this issue, we have previously proposed EL speech enhancement systems using statistical F0 pattern prediction methods based on a Gaussian Mixture...

Speech Enhancement using Adaptive Data-Based Dictionary Learning

In this paper, a speech enhancement method based on sparse representation of data frames is presented. Speech enhancement is one of the most widely applied areas of signal processing. The objective of a speech enhancement system is to improve either the intelligibility or the quality of speech signals. This process is carried out using the speech signal processing techniques ...

Journal title:

IEICE Transactions

Volume 97-D

Publication date: 2014
